Do Computers Lie?

We have often been told that “computers don’t lie”. Yes, computers don’t lie, but neither do they speak the truth. A computer does what its master programs it to do. Similarly, a model won’t lie unless the machine learning engineer makes it lie.

Machine Bias

There was a nice episode of the podcast You Are Not So Smart that came out last year. This is an excerpt from it:

“I want a machine-learning algorithm to learn what tumors looked like in the past, and I want it to become biased toward selecting those kind of tumors in the future,” explains philosopher Shannon Vallor at Santa Clara University. “But I don’t want a machine-learning algorithm to learn what successful engineers and doctors looked like in the past and then become biased toward selecting those kinds of people when sorting and ranking resumes.”

The Problem

Machine bias can occur due to a number of factors; one of the most common is a biased training dataset.

Below is an example of how Google Translate applies its bias (primarily due to the biased nature of its training dataset) when the following text is translated into a gender-neutral language and back into English.

(Screenshots: the Google Translate round trip through a gender-neutral language)

The Solution

The first step in solving any problem is accepting that the problem exists. Let’s accept that fact and see how the Kaggle Survey results can help the community tackle machine bias.

Libraries

Ignorance is Bliss - but not always!

The plot above demonstrates how often the questions about model fairness and bias have been ignored.

While the salary question led 15% of respondents to skip it, the questions about reproducibility, explainability, and bias were skipped by 37% of respondents. The salary question is included for comparison, to show how much worse these questions fare relative to another sensitive question.
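As a rough sketch of how such a skip rate can be computed, assuming the survey answers sit in a data frame with one column per question and NA marking a skipped answer (the data frame and column names below are illustrative, not the real survey schema):

```r
library(dplyr)

# Toy stand-in for the survey data: NA marks a skipped question
responses <- tibble::tibble(
  salary   = c("0-10k", NA, "10-20k", "20-30k", NA, "0-10k"),
  fairness = c("Very important", NA, NA, "Slightly important", NA, NA)
)

# Share of respondents who skipped each question
skip_rate <- responses %>%
  summarise(across(everything(), ~ mean(is.na(.x))))

skip_rate
# salary 0.333..., fairness 0.667 (toy numbers)
```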

Reproducibility, Explainability and Bias

To get a better perspective on the volume of respondents, below is the same plot as above but with the absolute number of respondents per option.

Fairness and Bias:

| Question | No opinion | I do not know | Not at all important | Slightly important | Very important |
|---|---|---|---|---|---|
| Being able to explain ML model outputs and/or predictions | 2.9% | 1.6% | 17.0% | 41.1% | 37.4% |
| Fairness and bias in ML algorithms | 5.4% | 2.3% | 19.0% | 36.0% | 37.4% |
| Reproducibility in data science | 3.8% | 1.0% | 14.9% | 42.9% | 37.4% |

Model Bias & Model Fairness

Gender

they %>% group_by(`What is your gender? - Selected Choice`) %>% count() %>% ungroup() %>% 
  rename("Gender" = `What is your gender? - Selected Choice`) %>% 
  mutate(n = n / sum(n),
         perc = percent(n)) %>% 
   ggplot() + geom_col(aes(Gender, n, fill = Gender), show.legend = FALSE) +
   geom_label(aes(x = Gender, y = n - 0.05, label = percent(n)),
           # hjust=0, vjust=0, size = 4, colour = 'black',
            fontface = 'bold') +
    scale_fill_viridis(discrete = T, option = "E") +

  scale_y_continuous(labels = percent_format()) +
  theme_minimal() + 
  theme(axis.text = element_text(angle = 45, size = 6)) +
  labs(title = "They",
       subtitle = "Perception on Fairness and Model Bias ",
       x = "Gender",
       y = "Percentage of Respondents (other than NAs)") -> p1



not_they %>% group_by(`What is your gender? - Selected Choice`) %>% count() %>% ungroup() %>% 
  rename("Gender" = `What is your gender? - Selected Choice`) %>% 
  mutate(n = n / sum(n),
         perc = percent(n)) %>% 
    ggplot() + geom_col(aes(Gender, n, fill = Gender), show.legend = FALSE) +
   geom_label(aes(x = Gender, y = n - 0.05, label = percent(n)),
           # hjust=0, vjust=0, size = 4, colour = 'black',
            fontface = 'bold') +
    scale_fill_viridis(discrete = T, option = "E") +

  scale_y_continuous(labels = percent_format()) +
  theme_minimal() + 
  theme(axis.text = element_text(angle = 45, size = 6)) +
  labs(title = "Not They",
       subtitle = "Perception on Fairness and Model Bias ",
       x = "Gender",
       y = "Percentage of Respondents (other than NAs)") -> p2


cowplot::plot_grid(p1,p2)

Let us create a new KPI called the They to Not-They Ratio to give a different perspective on this comparison.

  • There is a 5.1 percentage-point difference in the share of female respondents between those who perceive model fairness and bias in ML as Very Important and the others.

  • While this could be read as women usually being the ones affected by these biases, it is also important to realize that male Kagglers don’t echo the same sentiment as their female counterparts. After all, a healthy model is what we all want, isn’t it?
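A minimal sketch of how a They to Not-They Ratio could be computed per group, assuming `they` and `not_they` are the two data frames already split on the fairness question (the helper function and the toy data below are illustrative, not from the original notebook):

```r
library(dplyr)

# They to Not-They Ratio: per-group share among "they" divided by the
# corresponding share among "not_they". A ratio above 1 means the group
# is over-represented among those who answered "Very important".
they_not_they_ratio <- function(they, not_they, col) {
  p_they <- they %>%
    count(.data[[col]]) %>%
    mutate(p_they = n / sum(n)) %>%
    select(-n)
  p_not <- not_they %>%
    count(.data[[col]]) %>%
    mutate(p_not_they = n / sum(n)) %>%
    select(-n)
  inner_join(p_they, p_not, by = col) %>%
    mutate(tnt_ratio = p_they / p_not_they)
}

# Toy usage with made-up numbers
they_toy     <- tibble::tibble(Gender = c(rep("Female", 2), rep("Male", 8)))
not_they_toy <- tibble::tibble(Gender = c(rep("Female", 1), rep("Male", 9)))
they_not_they_ratio(they_toy, not_they_toy, "Gender")
# Female: 0.2 / 0.1 = 2.0; Male: 0.8 / 0.9, roughly 0.89
```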

Age

Age doesn’t seem to reveal anything straight away, which could be due to the large number of age brackets. Let us try a bit of feature engineering to club them into two groups: under 30 and 30+.

they %>% 
  mutate(age_grp = ifelse(parse_number(`What is your age (# years)?`) < 30,
                          "Less than 30",
                          "30+")) %>% 
  group_by(age_grp) %>% count() %>% ungroup() %>% 
  rename("Age" = age_grp) %>% 
  mutate(n = n / sum(n),
         perc = percent(n)) %>% 
   ggplot() + geom_col(aes(Age, n, fill = Age), show.legend = FALSE) +
   geom_label(aes(x = Age, y = n - 0.05, label = percent(n)),
           # hjust=0, vjust=0, size = 4, colour = 'black',
            fontface = 'bold') +
    scale_fill_viridis(discrete = T, option = "E") +
  scale_y_continuous(labels = percent_format()) +
  theme_minimal() + 
  theme(axis.text = element_text(angle = 45, size = 6)) +
  labs(title = "They",
       subtitle = "Perception on Fairness and Model Bias ",
       x = "Age",
       y = "Percentage of Respondents (other than NAs)") -> p1



not_they %>% 
   mutate(age_grp = ifelse(parse_number(`What is your age (# years)?`) < 30,
                          "Less than 30",
                          "30+")) %>% 
  group_by(age_grp) %>% count() %>% ungroup() %>% 
  rename("Age" = age_grp) %>% 
  mutate(n = n / sum(n),
         perc = percent(n)) %>% 
    ggplot() + geom_col(aes(Age, n, fill = Age), show.legend = FALSE) +
   geom_label(aes(x = Age, y = n - 0.05, label = percent(n)),
           # hjust=0, vjust=0, size = 4, colour = 'black',
            fontface = 'bold') +
    scale_fill_viridis(discrete = T, option = "E") +
  scale_y_continuous(labels = percent_format()) +
  theme_minimal() + 
  theme(axis.text = element_text(angle = 45, size = 6)) +
  labs(title = "Not They",
       subtitle = "Perception on Fairness and Model Bias ",
       x = "Age",
       y = "Percentage of Respondents (other than NAs)") -> p2


cowplot::plot_grid(p1,p2)

This plot suggests that younger Kagglers need to be brought up to speed on the implications of model bias and fairness more than their older counterparts. That leads us to another important section: what they do.

Student vs Professionals

they %>% 
  mutate(title = ifelse(`Select the title most similar to your current role (or most recent title if retired): - Selected Choice` == "Student",
                          "Student",
                          "Professional")) %>% 
  group_by(title) %>% count() %>% ungroup() %>% 
  rename("Title" = title) %>% 
  mutate(n = n / sum(n),
         perc = percent(n)) %>% 
   ggplot() + geom_col(aes(Title, n, fill = Title), show.legend = FALSE) +
   geom_label(aes(x = Title, y = n - 0.05, label = percent(n)),
           # hjust=0, vjust=0, size = 4, colour = 'black',
            fontface = 'bold') +
    scale_fill_viridis(discrete = T, option = "C") +
  scale_y_continuous(labels = percent_format()) +
  theme_minimal() + 
  theme(axis.text = element_text(angle = 45, size = 6)) +
  labs(title = "They",
       subtitle = "Perception on Fairness and Model Bias ",
       x = "Title",
       y = "Percentage of Respondents (other than NAs)") -> p1


not_they %>% 
  mutate(title = ifelse(`Select the title most similar to your current role (or most recent title if retired): - Selected Choice` == "Student",
                          "Student",
                          "Professional")) %>% 
  group_by(title) %>% count() %>% ungroup() %>% 
  rename("Title" = title) %>% 
  mutate(n = n / sum(n),
         perc = percent(n)) %>% 
    ggplot() + geom_col(aes(Title, n, fill = Title), show.legend = FALSE) +
   geom_label(aes(x = Title, y = n - 0.05, label = percent(n)),
           # hjust=0, vjust=0, size = 4, colour = 'black',
            fontface = 'bold') +
  scale_fill_viridis(discrete = T, option = "C") +
  scale_y_continuous(labels = percent_format()) +
  theme_minimal() + 
  theme(axis.text = element_text(angle = 45, size = 6)) +
  labs(title = "Not They",
       subtitle = "Perception on Fairness and Model Bias ",
       x = "Title",
       y = "Percentage of Respondents (other than NAs)") -> p2


cowplot::plot_grid(p1,p2)

  • As with the previous insight, it once again appears that students need to be educated about model bias and fairness, since there is a ~4 percentage-point difference between students in They vs. Not They.

Undergraduate Majors

they %>% 
  mutate(UG = ifelse(`Which best describes your undergraduate major? - Selected Choice` %in% c("Computer science (software engineering, etc.)","Information technology, networking, or system administration"),
                          "CS",
                          "Non_CS")) %>% 
  group_by(UG) %>% count() %>% ungroup() %>% 
  rename("UG" = UG) %>% 
  mutate(n = n / sum(n),
         perc = percent(n)) %>% 
   ggplot() + geom_col(aes(UG, n, fill = UG), show.legend = FALSE) +
   geom_label(aes(x = UG, y = n - 0.05, label = percent(n)),
           # hjust=0, vjust=0, size = 4, colour = 'black',
            fontface = 'bold') +
  scale_fill_viridis(discrete = T, option = "D") +
  scale_y_continuous(labels = percent_format()) +
  theme_minimal() + 
  theme(axis.text = element_text(angle = 45, size = 6)) +
  labs(title = "They",
       subtitle = "Perception on Fairness and Model Bias ",
       x = "Title",
       y = "Percentage of Respondents (other than NAs)") -> p1


not_they %>% 
   mutate(UG = ifelse(`Which best describes your undergraduate major? - Selected Choice` %in% c("Computer science (software engineering, etc.)","Information technology, networking, or system administration"),
                          "CS",
                          "Non_CS")) %>% 
  group_by(UG) %>% count() %>% ungroup() %>% 
  rename("UG" = UG) %>% 
  mutate(n = n / sum(n),
         perc = percent(n)) %>% 
   ggplot() + geom_col(aes(UG, n, fill = UG), show.legend = FALSE) +
   geom_label(aes(x = UG, y = n - 0.05, label = percent(n)),
           # hjust=0, vjust=0, size = 4, colour = 'black',
            fontface = 'bold') +
  scale_fill_viridis(discrete = T, option = "D") +
  scale_y_continuous(labels = percent_format()) +
  theme_minimal() + 
  theme(axis.text = element_text(angle = 45, size = 6)) +
  labs(title = "Not They",
       subtitle = "Perception on Fairness and Model Bias ",
       x = "Title",
       y = "Percentage of Respondents (other than NAs)") -> p2


cowplot::plot_grid(p1,p2)

  • Computer Science / IT engineers show a difference of ~3.4 percentage points between They and Not They, which suggests this group needs more attention in the mindset change than non-CS majors (even though they need it too). At the least, this can help prioritise where to start any awareness campaign.

R vs Python vs more

  • R users tend to perceive model bias and fairness as Very Important more than their Python counterparts.

All Countries

Countries with at least 100 respondents

  • Chile leads the pack if all countries are considered; if only countries with more than 100 respondents are selected, South Africa, followed by Nigeria, leads with a healthy They to Not-They ratio.

Work Industry

  • Kagglers in industries like Non-Profit/Service and Government/Public Service have a better perception of the importance of model fairness and bias.

  • The above plot once again emphasises the importance of educating students about model bias and fairness.
  • It’s also unhealthy to see sectors like Military and Internet-based Services falling behind, as those are the places where model evaluation is crucial and can have serious consequences.

Data Scientist?

  • Those who consider themselves Definitely a data scientist are also more likely to believe in the importance of model bias and fairness, while those who do not consider themselves data scientists sit at the other end of the spectrum, with the lowest They to Not-They ratio!

Type of Data

  • Kagglers who use Genetic Data lead the table in the importance given to model bias
  • Kagglers who use Image Data and Video Data are the least bothered

MOOC - Online course Platform

  • Ironically, Kaggle Learn, along with Coursera and Udemy, tops the list of platforms where people who don’t think model bias is very important learn data science / machine learning
  • Kagglers who learn DS from Google Developers, Theschool.ai, and Fast.ai lead in thinking that model bias is very important

Percentage of Data Projects

  • As you can see above, there is very little demand in the workplace to explore unfair ML bias.
  • In fact, only ~1.2K of the ~13K respondents who answered this question reported needing to explore unfair bias in more than 50% of their data projects (that’s less than 10%).
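The back-of-the-envelope arithmetic behind that claim (using the rounded counts from the plot):

```r
# ~1.2K of ~13K respondents report exploring unfair bias in >50% of projects
1200 / 13000  # roughly 0.092, i.e. less than 10%
```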

Model Interpretability / Explaining ML Models

Who are they?

They refers to those who think Being able to explain ML model outputs and/or predictions is Very Important in machine learning, and Not They are those who think otherwise, including Somewhat Important, Not at all important, and similar options.
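A sketch of how this split could be made, assuming the raw survey answers are in a data frame and the question column is named as below (both the data frame and the column name here are stand-ins; the real survey export uses longer question-text column names):

```r
library(dplyr)

# Assumed column name; the real Kaggle survey export differs
q <- "Q_explain_importance"

# Toy stand-in for the survey data
mcq <- tibble::tibble(
  Q_explain_importance = c("Very important", "Slightly important",
                           "Not at all important", NA, "Very important")
)

# In the notebook these would be assigned to `they` / `not_they`;
# suffixed names are used here to keep the sketch self-contained
they_sketch     <- mcq %>% filter(.data[[q]] == "Very important")
not_they_sketch <- mcq %>% filter(!is.na(.data[[q]]),
                                  .data[[q]] != "Very important")

nrow(they_sketch)      # 2
nrow(not_they_sketch)  # 2 (the NA respondent is excluded from both)
```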

Gender

Age

  • The cohort in the 18–29 age group has almost half the index value of the 60–69 cohort, which shows how age and experience play a vital role in the perception of interpretable machine learning.

R vs Python vs more

  • Unexpectedly, SAS/STATA tops the They to Not-They index, followed by R and MATLAB, making users of these three languages the Kagglers most convinced that model interpretability is very important.

  • Python users need to be made more aware of IML, as Python still trails behind SQL and Julia.

  • Kagglers with a Humanities background strongly believe in IML (Interpretable Machine Learning), with a TNT index of 2.28, followed by Engineering (Non-CS) and Mathematics/Statistics
  • Ironically, Computer Science engineers are nowhere near the top, with an index of 1.1764

Work Industry

  • Kagglers with an Insurance or Military/Defense industry background lead the table
  • As we’ve seen before, students are nowhere close to perceiving interpretable machine learning as Very Important
  • Industries like Marketing and Online Services fare even worse than students in their perception

MOOC - Online course Platform

  • Kaggle Learn, which has a dedicated course on interpretable machine learning, hasn’t managed to reach the top, landing only in 5th position
  • DataCamp Kagglers feel strongly that IML is Very Important
  • Unlike the model-bias perception, where Fast.ai was on top, here it sits at rock bottom

Exploring model insights? - Percentage of Data Projects

ML models to be black boxes - Difficult to explain

The above plot tells us that most people are confident they can understand and explain the outputs of many, but not all, ML models. In fact, those who feel most ML models are black boxes outnumber those who have no opinion on the matter.

Recommendations